03 Caching Strategies
- Caching Strategies: Making Millions of Requests Feel Like Thousands
- Why Caching is Critical at Scale
- 1. Multi-Layer Caching Architecture
- Why Multiple Layers?
- 2. In-Memory Caching (IMemoryCache)
- Cache Stampede Protection (Critical!)
- 3. Distributed Caching with Redis
- Setup StackExchange.Redis
- Cache-Aside Pattern
- 4. Two-Layer Cache (L1 + L2)
- 5. Cache Invalidation Strategies
- Strategy 1: TTL with Jitter
- Strategy 2: Write-Through (Update cache on write)
- Strategy 3: Cache Invalidation via Events
- Strategy 4: Tag-Based Invalidation (Redis)
- 6. HTTP Caching (ETags & 304 Not Modified)
- 7. CDN for Static Content
- Summary: Caching Checklist
Caching Strategies: Making Millions of Requests Feel Like Thousands
Why Caching is Critical at Scale
The Math:
- Database query: 50ms average
- Redis cache hit: 1-2ms
- In-memory cache hit: 0.01ms (10 microseconds)
With millions of users, your database is the bottleneck. Caching moves the "hot path" from slow storage to fast memory.
Key Principle: The fastest request is one you never have to process.
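To see what hit rate buys you, here is a quick back-of-the-envelope sketch using the latencies above (adjust the numbers to your own measurements):

// Blended latency = hitRate * cacheLatency + (1 - hitRate) * dbLatency
double Blended(double hitRate, double cacheMs, double dbMs) =>
    hitRate * cacheMs + (1 - hitRate) * dbMs;

Console.WriteLine(Blended(0.90, 1.5, 50)); // ~6.35ms average with 90% Redis hits
Console.WriteLine(Blended(0.99, 1.5, 50)); // ~1.99ms average with 99% Redis hits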
---
1. Multi-Layer Caching Architecture
Use multiple cache layers for different access patterns:
Request → In-Memory (L1) → Redis (L2) → Database (L3)
              ~10μs           ~1-2ms        ~50ms+
Why Multiple Layers?
- L1 (In-Memory): Ultra-fast for per-instance hot data
- L2 (Redis): Shared across instances, survives restarts
- L3 (Database): Source of truth
---
2. In-Memory Caching (IMemoryCache)
Best for: Frequently accessed, rarely changing data per instance.
using Microsoft.Extensions.Caching.Memory;
public class ProductService
{
private readonly IMemoryCache _cache;
private readonly IProductRepository _repo;
private readonly ILogger<ProductService> _logger;
public ProductService(
IMemoryCache cache,
IProductRepository repo,
ILogger<ProductService> logger)
{
_cache = cache;
_repo = repo;
_logger = logger;
}
public async Task<Product> GetProductAsync(int id, CancellationToken ct)
{
var cacheKey = $"product:{id}";
// Try in-memory cache first
if (_cache.TryGetValue(cacheKey, out Product? cached))
{
_logger.LogDebug("Cache HIT for product {ProductId}", id);
return cached!;
}
_logger.LogDebug("Cache MISS for product {ProductId}", id);
// Load from database
var product = await _repo.GetByIdAsync(id, ct);
// Cache with options
var cacheOptions = new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10),
SlidingExpiration = TimeSpan.FromMinutes(2), // Renews if accessed
Size = 1, // For size-based eviction
Priority = CacheItemPriority.Normal
};
// Add callback to log eviction
cacheOptions.RegisterPostEvictionCallback((key, value, reason, state) =>
{
_logger.LogDebug(
"Cache eviction: {Key}, Reason: {Reason}",
key, reason
);
});
_cache.Set(cacheKey, product, cacheOptions);
return product;
}
}
// Configure memory cache in Program.cs
builder.Services.AddMemoryCache(options =>
{
    options.SizeLimit = 1024; // Total "size units"; entries above set Size = 1, so this caps entry count
options.CompactionPercentage = 0.25; // Compact 25% when limit hit
});
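For simple cases you can skip the manual TryGetValue/Set dance: IMemoryCache ships a GetOrCreateAsync extension (in Microsoft.Extensions.Caching.Memory) that wraps the same pattern. A minimal sketch:

public Task<Product?> GetProductAsync(int id, CancellationToken ct)
{
    return _cache.GetOrCreateAsync($"product:{id}", entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
        entry.Size = 1;
        return _repo.GetByIdAsync(id, ct);
    });
}

Note that GetOrCreateAsync offers no stampede protection: concurrent misses on the same key can still run the factory in parallel, which is exactly what the next section addresses.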
Cache Stampede Protection (Critical!)
When a hot cache entry expires and 1,000 concurrent requests miss at the same moment, every one of them queries the DB. Use per-key locking so only the first request does the work.
public class StampedeProofCacheService<T> where T : class
{
private readonly IMemoryCache _cache;
private static readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();
    private readonly ILogger<StampedeProofCacheService<T>> _logger;
    public StampedeProofCacheService(
        IMemoryCache cache,
        ILogger<StampedeProofCacheService<T>> logger)
{
_cache = cache;
_logger = logger;
}
public async Task<T> GetOrCreateAsync(
string key,
Func<CancellationToken, Task<T>> factory,
TimeSpan expiration,
CancellationToken ct)
{
// Fast path: cache hit
if (_cache.TryGetValue(key, out T? cached))
{
return cached!;
}
// Slow path: acquire per-key lock
var semaphore = _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
await semaphore.WaitAsync(ct);
try
{
// Double-check after acquiring lock (another thread may have loaded it)
if (_cache.TryGetValue(key, out cached))
{
_logger.LogDebug("Cache hit after lock acquisition for {Key}", key);
return cached!;
}
_logger.LogDebug("Loading data for {Key}", key);
// Only one thread executes this
var value = await factory(ct);
// Add jitter to TTL to prevent thundering herd
var jitter = TimeSpan.FromSeconds(Random.Shared.Next(0, 30));
var ttl = expiration + jitter;
_cache.Set(key, value, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = ttl
});
return value;
}
        finally
        {
            semaphore.Release();

            // Cleanup: drop the lock entry once no one is waiting, to keep
            // the dictionary from growing unboundedly. There is a benign race
            // here (another thread may have fetched this semaphore just before
            // removal), so we deliberately do NOT Dispose() it — disposing
            // while a late arrival is about to wait on it would throw.
            if (semaphore.CurrentCount == 1)
            {
                _locks.TryRemove(key, out _);
            }
        }
}
}
// Usage
public class UserService
{
    private readonly StampedeProofCacheService<User> _cache;
    private readonly IUserRepository _repo;

    public UserService(StampedeProofCacheService<User> cache, IUserRepository repo)
    {
        _cache = cache;
        _repo = repo;
    }
public async Task<User> GetUserAsync(int userId, CancellationToken ct)
{
return await _cache.GetOrCreateAsync(
key: $"user:{userId}",
factory: async token => await _repo.GetByIdAsync(userId, token),
expiration: TimeSpan.FromMinutes(5),
ct: ct
);
}
}
Why this matters:
- Without locking: 1000 concurrent requests → 1000 DB queries
- With locking: 1000 concurrent requests → 1 DB query, 999 wait for result
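One caveat: these per-key locks are per instance. With 20 app instances, a cold key can still trigger 20 concurrent DB queries. If that matters, layer a distributed lock on top. A sketch using Redis SET NX, assuming an IDatabase _redis field plus the key/factory/ttl/ct variables from the service above (lock key name and timeout are illustrative):

// SET lockKey value NX EX <ttl> — succeeds only if the lock key doesn't exist,
// so exactly one instance becomes the loader.
var lockKey = $"lock:{key}";
var acquired = await _redis.StringSetAsync(
    lockKey,
    Environment.MachineName,
    TimeSpan.FromSeconds(30), // lock auto-expires if the loader crashes
    When.NotExists
);

if (acquired)
{
    try
    {
        var value = await factory(ct);
        await _redis.StringSetAsync(key, JsonSerializer.Serialize(value), ttl);
    }
    finally
    {
        await _redis.KeyDeleteAsync(lockKey);
    }
}
else
{
    // Another instance is loading; back off briefly, then re-read the cache
    await Task.Delay(100, ct);
}

A production version would retry in a loop and guard against deleting a lock it no longer owns; this sketch only shows the shape of the idea.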
---
3. Distributed Caching with Redis
Best for: Shared state across instances, session data, high-traffic reads.
Setup StackExchange.Redis
// Program.cs
builder.Services.AddSingleton<IConnectionMultiplexer>(sp =>
{
var config = ConfigurationOptions.Parse(
builder.Configuration.GetConnectionString("Redis")!
);
    config.AbortOnConnectFail = false; // Keep retrying in the background instead of failing startup
config.ConnectTimeout = 5000;
config.SyncTimeout = 5000;
config.AsyncTimeout = 5000;
config.ConnectRetry = 3;
config.KeepAlive = 60;
    // Use logical database 0 (the default)
    config.DefaultDatabase = 0;
var connection = ConnectionMultiplexer.Connect(config);
// Log connection events
connection.ConnectionFailed += (sender, args) =>
{
var logger = sp.GetRequiredService<ILogger<Program>>();
logger.LogError(
"Redis connection failed: {EndPoint}, {FailureType}",
args.EndPoint, args.FailureType
);
};
connection.ConnectionRestored += (sender, args) =>
{
var logger = sp.GetRequiredService<ILogger<Program>>();
logger.LogInformation("Redis connection restored: {EndPoint}", args.EndPoint);
};
return connection;
});
builder.Services.AddSingleton<IDistributedCache>(sp =>
{
var redis = sp.GetRequiredService<IConnectionMultiplexer>();
return new RedisCache(new RedisCacheOptions
{
ConnectionMultiplexerFactory = () => Task.FromResult(redis),
InstanceName = "MyApp:" // Prefix for all keys
});
});
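With this registration in place, IDistributedCache already gives you a simple string/byte-oriented API (the standard Microsoft.Extensions.Caching.Distributed extension methods); for example, assuming an injected _distributedCache field:

// Keys are automatically prefixed with "MyApp:" per the InstanceName above
await _distributedCache.SetStringAsync(
    "greeting",
    "hello",
    new DistributedCacheEntryOptions
    {
        AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
    },
    ct
);

var greeting = await _distributedCache.GetStringAsync("greeting", ct);

The cache-aside service below bypasses IDistributedCache and talks to StackExchange.Redis's IDatabase directly, trading the abstraction for typed JSON helpers and fire-and-forget writes.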
Cache-Aside Pattern
public interface ICacheService
{
Task<T?> GetAsync<T>(string key, CancellationToken ct) where T : class;
Task SetAsync<T>(string key, T value, TimeSpan expiration, CancellationToken ct) where T : class;
Task RemoveAsync(string key, CancellationToken ct);
}
public class RedisCacheService : ICacheService
{
private readonly IDatabase _redis;
private readonly ILogger<RedisCacheService> _logger;
public RedisCacheService(
IConnectionMultiplexer multiplexer,
ILogger<RedisCacheService> logger)
{
_redis = multiplexer.GetDatabase();
_logger = logger;
}
public async Task<T?> GetAsync<T>(string key, CancellationToken ct) where T : class
{
try
{
var value = await _redis.StringGetAsync(key);
if (value.IsNullOrEmpty)
{
return null;
}
return JsonSerializer.Deserialize<T>(value!);
}
catch (Exception ex)
{
_logger.LogError(ex, "Redis GET failed for key {Key}", key);
return null; // Fail gracefully
}
}
public async Task SetAsync<T>(
string key,
T value,
TimeSpan expiration,
CancellationToken ct) where T : class
{
try
{
var json = JsonSerializer.Serialize(value);
await _redis.StringSetAsync(
key,
json,
expiration,
flags: CommandFlags.FireAndForget // Don't wait for confirmation
);
}
catch (Exception ex)
{
_logger.LogError(ex, "Redis SET failed for key {Key}", key);
// Don't throw - cache failures shouldn't break the app
}
}
public async Task RemoveAsync(string key, CancellationToken ct)
{
try
{
await _redis.KeyDeleteAsync(key, CommandFlags.FireAndForget);
}
catch (Exception ex)
{
_logger.LogError(ex, "Redis DELETE failed for key {Key}", key);
}
}
}
// Usage in service
public class OrderService
{
    private readonly ICacheService _cache;
    private readonly IOrderRepository _repo;
    private readonly ILogger<OrderService> _logger;

    public OrderService(ICacheService cache, IOrderRepository repo, ILogger<OrderService> logger)
    {
        _cache = cache;
        _repo = repo;
        _logger = logger;
    }
public async Task<Order> GetOrderAsync(int orderId, CancellationToken ct)
{
var cacheKey = $"order:{orderId}";
// Try cache first
var cached = await _cache.GetAsync<Order>(cacheKey, ct);
if (cached != null)
{
_logger.LogDebug("Redis cache HIT for order {OrderId}", orderId);
return cached;
}
_logger.LogDebug("Redis cache MISS for order {OrderId}", orderId);
// Load from DB
var order = await _repo.GetByIdAsync(orderId, ct);
// Update cache (fire and forget)
_ = _cache.SetAsync(cacheKey, order, TimeSpan.FromMinutes(10), ct);
return order;
}
public async Task UpdateOrderAsync(Order order, CancellationToken ct)
{
// Update DB
await _repo.UpdateAsync(order, ct);
// Invalidate cache
await _cache.RemoveAsync($"order:{order.Id}", ct);
}
}
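One gap in the code above: if an order doesn't exist, every lookup still hits the database. A common fix is negative caching — store a short-lived sentinel for misses. A sketch that drops to a raw IDatabase _redis field (sentinel value and TTLs are illustrative):

private const string NotFoundSentinel = "__NOT_FOUND__";

public async Task<Order?> GetOrderOrNullAsync(int orderId, CancellationToken ct)
{
    var cacheKey = $"order:{orderId}";

    var raw = await _redis.StringGetAsync(cacheKey);
    if (!raw.IsNullOrEmpty)
    {
        // A cached "not found" marker means: don't hit the DB again yet
        if (raw == NotFoundSentinel) return null;
        return JsonSerializer.Deserialize<Order>(raw!);
    }

    var order = await _repo.GetByIdAsync(orderId, ct);
    if (order is null)
    {
        // Cache the miss briefly so repeated lookups don't hammer the DB
        await _redis.StringSetAsync(cacheKey, NotFoundSentinel, TimeSpan.FromSeconds(30));
        return null;
    }

    await _redis.StringSetAsync(cacheKey, JsonSerializer.Serialize(order), TimeSpan.FromMinutes(10));
    return order;
}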
---
4. Two-Layer Cache (L1 + L2)
Combine in-memory and Redis for best performance.
public class TwoLayerCacheService
{
private readonly IMemoryCache _l1Cache;
private readonly IDatabase _redis;
private readonly ILogger<TwoLayerCacheService> _logger;
public TwoLayerCacheService(
IMemoryCache memoryCache,
IConnectionMultiplexer redis,
ILogger<TwoLayerCacheService> logger)
{
_l1Cache = memoryCache;
_redis = redis.GetDatabase();
_logger = logger;
}
public async Task<T?> GetAsync<T>(
string key,
Func<CancellationToken, Task<T>> factory,
TimeSpan expiration,
CancellationToken ct) where T : class
{
// L1 (in-memory) check
if (_l1Cache.TryGetValue(key, out T? l1Value))
{
_logger.LogDebug("L1 cache HIT for {Key}", key);
return l1Value;
}
// L2 (Redis) check
try
{
var redisValue = await _redis.StringGetAsync(key);
if (!redisValue.IsNullOrEmpty)
{
_logger.LogDebug("L2 cache HIT for {Key}", key);
var l2Value = JsonSerializer.Deserialize<T>(redisValue!);
// Backfill L1
_l1Cache.Set(key, l2Value, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(1) // Shorter than L2
});
return l2Value;
}
}
catch (Exception ex)
{
_logger.LogWarning(ex, "L2 cache read failed for {Key}", key);
}
_logger.LogDebug("Cache MISS for {Key}, loading from source", key);
// Load from source
var value = await factory(ct);
// Write to both layers
_l1Cache.Set(key, value, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(1)
});
try
{
var json = JsonSerializer.Serialize(value);
await _redis.StringSetAsync(key, json, expiration);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "L2 cache write failed for {Key}", key);
}
return value;
}
}
Why two layers:
- L1 eliminates network latency for hot items
- L2 shares data across instances
- Survives instance restarts
- Best of both worlds
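If you're on .NET 9+ (or can take the NuGet package), consider HybridCache from Microsoft.Extensions.Caching.Hybrid: it implements exactly this L1+L2 pattern with stampede protection built in, so you may not need to hand-roll it. A minimal sketch:

// Program.cs — HybridCache uses the registered IDistributedCache as its L2
builder.Services.AddHybridCache();

// Usage (class and repository names are illustrative)
public class CatalogService(HybridCache cache, IProductRepository repo)
{
    public async Task<Product> GetProductAsync(int id, CancellationToken ct) =>
        await cache.GetOrCreateAsync(
            $"product:{id}",
            async token => await repo.GetByIdAsync(id, token),
            cancellationToken: ct
        );
}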
---
5. Cache Invalidation Strategies
The two hard problems in computer science:
- Cache invalidation
- Naming things
- Off-by-one errors
Strategy 1: TTL with Jitter
public static TimeSpan GetTTLWithJitter(TimeSpan baseTTL)
{
var jitterSeconds = Random.Shared.Next(0, (int)(baseTTL.TotalSeconds * 0.1));
return baseTTL + TimeSpan.FromSeconds(jitterSeconds);
}
// Usage
var ttl = GetTTLWithJitter(TimeSpan.FromMinutes(10)); // 10-11 minutes
await _cache.SetAsync(key, value, ttl, ct);
Why jitter: Prevents cache stampede when many entries expire simultaneously.
Strategy 2: Write-Through (Update cache on write)
public async Task UpdateUserAsync(User user, CancellationToken ct)
{
// Update database
await _repo.UpdateAsync(user, ct);
// Update cache immediately
var cacheKey = $"user:{user.Id}";
await _cache.SetAsync(cacheKey, user, TimeSpan.FromMinutes(10), ct);
}
Strategy 3: Cache Invalidation via Events
// Assumes MediatR-style notifications (INotification / INotificationHandler)
public class UserUpdatedEvent : INotification
{
    public int UserId { get; set; }
}
public class UserCacheInvalidationHandler : INotificationHandler<UserUpdatedEvent>
{
    private readonly ICacheService _cache;

    public UserCacheInvalidationHandler(ICacheService cache) => _cache = cache;
public async Task Handle(UserUpdatedEvent notification, CancellationToken ct)
{
await _cache.RemoveAsync($"user:{notification.UserId}", ct);
// Could also remove related keys: $"user:{userId}:orders", etc.
}
}
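Events also close a gap the two-layer cache above left open: after an update, other instances keep serving stale L1 entries until they expire. A common pattern is broadcasting invalidations over Redis pub/sub, a sketch of which follows (channel name is illustrative):

public class CacheInvalidationSubscriber : IHostedService
{
    private readonly IConnectionMultiplexer _redis;
    private readonly IMemoryCache _l1Cache;

    public CacheInvalidationSubscriber(IConnectionMultiplexer redis, IMemoryCache l1Cache)
    {
        _redis = redis;
        _l1Cache = l1Cache;
    }

    public async Task StartAsync(CancellationToken ct)
    {
        // Every instance subscribes; each evicts its own L1 entry
        var sub = _redis.GetSubscriber();
        await sub.SubscribeAsync(
            RedisChannel.Literal("cache:invalidate"),
            (_, key) => _l1Cache.Remove(key.ToString())
        );
    }

    public Task StopAsync(CancellationToken ct) => Task.CompletedTask;
}

// Publisher side, after a write:
await _redis.GetSubscriber().PublishAsync(
    RedisChannel.Literal("cache:invalidate"), $"user:{user.Id}");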
Strategy 4: Tag-Based Invalidation (Redis)
public class TaggedCacheService
{
    private readonly IDatabase _redis;

    public TaggedCacheService(IConnectionMultiplexer multiplexer) => _redis = multiplexer.GetDatabase();
public async Task SetWithTagsAsync(
string key,
object value,
string[] tags,
TimeSpan expiration,
CancellationToken ct)
{
var json = JsonSerializer.Serialize(value);
// Store the value
await _redis.StringSetAsync(key, json, expiration);
        // Index the key under each tag set. Note: the tag sets themselves
        // never expire here, so members can outlive their values; a production
        // version would also expire or periodically prune the tag sets.
        foreach (var tag in tags)
        {
            await _redis.SetAddAsync($"tag:{tag}", key);
        }
}
public async Task InvalidateByTagAsync(string tag, CancellationToken ct)
{
        // Get all keys with this tag
        var keys = await _redis.SetMembersAsync($"tag:{tag}");

        // Delete them all in a single round trip
        if (keys.Length > 0)
        {
            await _redis.KeyDeleteAsync(keys.Select(k => (RedisKey)k.ToString()).ToArray());
        }
// Remove the tag set
await _redis.KeyDeleteAsync($"tag:{tag}");
}
}
// Usage
await _taggedCache.SetWithTagsAsync(
"product:123",
product,
new[] { "products", "category:electronics", "brand:apple" },
TimeSpan.FromMinutes(10),
ct
);
// Invalidate all Apple products
await _taggedCache.InvalidateByTagAsync("brand:apple", ct);
---
6. HTTP Caching (ETags & 304 Not Modified)
Reduce bandwidth and processing for GET requests.
[ApiController]
[Route("api/[controller]")]
public class ProductsController : ControllerBase
{
    private readonly IProductService _productService;

    public ProductsController(IProductService productService) => _productService = productService;

    [HttpGet("{id}")]
    [ResponseCache(Duration = 60)] // Sets Cache-Control: public,max-age=60; each id is already a distinct URL
public async Task<IActionResult> GetProductAsync(int id, CancellationToken ct)
{
var product = await _productService.GetProductAsync(id, ct);
// Generate ETag from product version or hash
var etag = $"\"{product.Version}\"";
Response.Headers.ETag = etag;
// Check if client has current version
if (Request.Headers.IfNoneMatch == etag)
{
return StatusCode(StatusCodes.Status304NotModified);
}
return Ok(product);
}
}
// Or use middleware for automatic ETag generation
public class ETagMiddleware
{
    private readonly RequestDelegate _next;

    public ETagMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        // Buffer the response so we can hash it before it's sent
        var originalStream = context.Response.Body;
        using var memoryStream = new MemoryStream();
        context.Response.Body = memoryStream;

        try
        {
            await _next(context);

            if (context.Response.StatusCode == 200)
            {
                var etag = $"\"{ComputeHash(memoryStream.ToArray())}\"";
                context.Response.Headers.ETag = etag;

                if (context.Request.Headers.IfNoneMatch == etag)
                {
                    context.Response.StatusCode = 304;
                    context.Response.ContentLength = 0;
                    return;
                }
            }

            memoryStream.Position = 0;
            await memoryStream.CopyToAsync(originalStream);
        }
        finally
        {
            // Always hand the original stream back to the server
            context.Response.Body = originalStream;
        }
    }

    private static string ComputeHash(byte[] data)
    {
        // SHA256.HashData avoids allocating a SHA256 instance per request
        return Convert.ToBase64String(SHA256.HashData(data));
    }
}
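Register the middleware early in the pipeline in Program.cs:

app.UseMiddleware<ETagMiddleware>();

Keep in mind this middleware buffers every response in memory just to hash it; when your resource has a natural version field, the controller approach above is much cheaper.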
---
7. CDN for Static Content
Serve static files (images, CSS, JS) from edge locations.
// Serve images with far-future expires
app.UseStaticFiles(new StaticFileOptions
{
OnPrepareResponse = ctx =>
{
// Cache for 1 year
ctx.Context.Response.Headers.CacheControl = "public,max-age=31536000";
// Use versioned URLs: /images/logo.v123.png
// When file changes, change version → new URL → cache busted
}
});
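For the versioned URLs mentioned in the comment, Razor's asp-append-version tag helper does this automatically. Outside Razor, a hypothetical helper might derive the version from the file's content hash (a sketch; the class and method names are illustrative):

// Hypothetical helper: append a content-hash version to asset URLs so a
// changed file produces a new URL and busts CDN/browser caches.
public static class AssetVersioner
{
    public static string Versioned(IWebHostEnvironment env, string relativePath)
    {
        var fullPath = Path.Combine(env.WebRootPath, relativePath.TrimStart('/'));
        var hash = Convert.ToHexString(
            SHA256.HashData(File.ReadAllBytes(fullPath)))[..8];

        // Production code should cache this per path — hashing on every
        // request would be wasteful.
        return $"/{relativePath.TrimStart('/')}?v={hash}";
    }
}

// Usage: AssetVersioner.Versioned(env, "images/logo.png") → /images/logo.png?v=3FA2B1C9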
---
Summary: Caching Checklist
- ✅ Multi-layer caching: In-memory (L1) + Redis (L2)
- ✅ Stampede protection: Per-key locks for cache misses
- ✅ TTL with jitter: Prevent thundering herd
- ✅ Fail gracefully: Cache failures don't break the app
- ✅ Invalidation strategy: Write-through, events, or tag-based
- ✅ HTTP caching: ETags, 304 responses, response caching
- ✅ CDN: For static assets and cacheable API responses
- ✅ Monitor: Cache hit rate, eviction rate, memory usage
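For that last item, here's a minimal hit/miss counter using System.Diagnostics.Metrics (meter, counter, and tag names are illustrative) that any OpenTelemetry-compatible exporter can scrape:

public static class CacheMetrics
{
    private static readonly Meter Meter = new("MyApp.Cache");
    private static readonly Counter<long> Hits = Meter.CreateCounter<long>("cache.hits");
    private static readonly Counter<long> Misses = Meter.CreateCounter<long>("cache.misses");

    public static void RecordHit(string layer) =>
        Hits.Add(1, new KeyValuePair<string, object?>("layer", layer));

    public static void RecordMiss(string layer) =>
        Misses.Add(1, new KeyValuePair<string, object?>("layer", layer));
}

// In the cache code: CacheMetrics.RecordHit("l1"); CacheMetrics.RecordMiss("l2"); ...

Hit rate per layer is then hits / (hits + misses).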
Key Insight: At scale, cache hit rate is everything. A 90% hit rate means 10x less database load. A 99% hit rate means 100x less load.
Next: Database Optimization & Scaling - When you do hit the database, make it fast.